Stereoscopy

Stereoscopy (also called stereoscopic or 3-D imaging) is a technique for creating or enhancing the illusion of depth in an image by presenting two offset images separately to the left and right eye of the viewer. The brain then combines these two 2-D offset images to give the perception of 3-D depth. Three strategies have been used to accomplish this: have the viewer wear eyeglasses to combine separate images from two offset sources, have the viewer wear eyeglasses to filter offset images from a single source so that a separate image reaches each eye, or have the light source split the images directionally into the viewer's eyes (no glasses required; known as autostereoscopy).[1]

Background

Stereoscopy creates the illusion of three-dimensional depth from images on a two-dimensional plane. Human vision uses several cues to determine relative depths in a perceived scene.[2] Some of these cues are:

All the above cues, with the exception of the first two, are present in traditional two-dimensional images such as paintings, photographs, and television. Stereoscopy is the enhancement of the illusion of depth in a photograph, movie, or other two-dimensional image by presenting a slightly different image to each eye, and thereby adding the first of these cues (stereopsis) as well. It is important to note that since all points in the image focus at the same plane regardless of their depth in the original scene, the second cue, focus, is still not duplicated and therefore the illusion of depth is incomplete.

Many 3D displays use this method to convey images. It was invented by Sir Charles Wheatstone in 1838.[3][4]

Wheatstone originally used his stereoscope (a rather bulky device)[5] with drawings because photography was not yet available, yet his original paper seems to foresee the development of a realistic imaging method[6]:

For the purposes of illustration I have employed only outline figures, for had either shading or colouring been introduced it might be supposed that the effect was wholly or in part due to these circumstances, whereas by leaving them out of consideration no room is left to doubt that the entire effect of relief is owing to the simultaneous perception of the two monocular projections, one on each retina. But if it be required to obtain the most faithful resemblances of real objects, shadowing and colouring may properly be employed to heighten the effects. Careful attention would enable an artist to draw and paint the two component pictures, so as to present to the mind of the observer, in the resultant perception, perfect identity with the object represented. Flowers, crystals, busts, vases, instruments of various kinds, &c., might thus be represented so as not to be distinguished by sight from the real objects themselves.[3]

Stereoscopy is used in photogrammetry and also for entertainment through the production of stereograms. It is useful for viewing images rendered from large multi-dimensional data sets, such as those produced by experiments. An early patent for 3D imaging in cinema and television was granted to physicist Theodor V. Ionescu in 1936. Modern industrial three-dimensional photography may use 3D scanners to detect and record three-dimensional information.[7] Three-dimensional depth information can also be reconstructed by computer from two images by matching corresponding pixels in the left and right images (e.g.,[8]). Finding these matches is known in computer vision as the correspondence problem; solving it yields meaningful depth information from the two images.
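As an illustration of the pixel matching described above, the following is a minimal sketch of block matching along a single scanline. It assumes rectified images from parallel cameras, so that a point at column x in the left image appears at column x - d in the right image; the function names, the window size, and the sum-of-absolute-differences cost are illustrative choices and are not taken from the cited work. The depth conversion uses the standard pinhole relation (depth = focal length × baseline / disparity).

```python
def scanline_disparity(left_row, right_row, window=1, max_disparity=4):
    """Estimate a disparity for every pixel of one scanline by block matching.

    left_row, right_row: equal-length lists of grey values (hypothetical input).
    Returns a list of integer disparities in pixels.
    """
    n = len(left_row)
    disparities = []
    for x in range(n):
        best_d, best_cost = 0, float("inf")
        for d in range(max_disparity + 1):
            # Sum of absolute differences over a small window,
            # clamping indices at the image borders.
            cost = 0
            for k in range(-window, window + 1):
                xl = min(max(x + k, 0), n - 1)
                xr = min(max(x + k - d, 0), n - 1)
                cost += abs(left_row[xl] - right_row[xr])
            if cost < best_cost:
                best_cost, best_d = cost, d
        disparities.append(best_d)
    return disparities


def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Pinhole-camera relation: larger disparity means a closer point."""
    if disparity_px <= 0:
        return float("inf")  # zero disparity corresponds to infinite distance
    return focal_length_px * baseline_m / disparity_px
```

Real stereo matchers add smoothness constraints and sub-pixel refinement; this sketch only shows the basic idea of searching for the best-matching patch.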

Etymology

The word stereoscopy derives from the Greek "στερεός" (stereos), "firm, solid"[9] + "σκοπέω" (skopeō), "to look", "to see".[10]

Visual requirements

Anatomically, there are 3 levels of binocular vision required to view stereo images:

  1. Simultaneous perception
  2. Fusion (binocular 'single' vision)
  3. Stereopsis

These functions develop in early childhood. In some people, strabismus disrupts the development of stereopsis; however, orthoptic treatment can be used to improve binocular vision. A person's stereoacuity determines the minimum image disparity they can perceive as depth.

Side-by-side (non-shared viewing scenarios)

Traditional stereoscopic photography consists of creating a 3-D illusion starting from a pair of 2-D images, a stereogram. The easiest way to enhance depth perception in the brain is to provide the eyes of the viewer with two different images, representing two perspectives of the same object, with a minor deviation equal or nearly equal to the difference between the perspectives that the two eyes naturally receive in binocular vision.

If eyestrain and distortion are to be avoided, each of the two 2-D images preferably should be presented to each eye of the viewer so that any object at infinite distance seen by the viewer should be perceived by that eye while it is oriented straight ahead, the viewer's eyes being neither crossed nor diverging. When the picture contains no object at infinite distance, such as a horizon or a cloud, the pictures should be spaced correspondingly closer together.

The side-by-side method is extremely simple to create, but it can be difficult or uncomfortable to view without optical aids. One such aid for non-crossed images is the modern Pokescope. Traditional stereoscopes such as the Holmes can be used as well. For the cross-view technique, the simple Perfect-Chroma cross-viewing glasses are now available to facilitate viewing.

Characteristics

Little or no additional image processing is required. Under some circumstances, such as when a pair of images is presented for crossed or diverged eye viewing, no device or additional optical equipment is needed.

The principal advantage of side-by-side viewers is that there is no diminution of brightness, and images may be presented at very high resolution and in full-spectrum color. The ghosting associated with polarized projection or color filtering is eliminated. The images are discretely presented to the eyes and visual center of the brain, with no co-mingling of the views. The recent advent of flat screens and "software stereoscopes" has made larger 3D digital images practical in this side-by-side mode, which hitherto had been used mainly with paired photos in print form.

Figure: Comparison of images for parallel and cross-eye freeviewing.

Freeviewing

Freeviewing is viewing a side-by-side image without using a viewer.[11]

Several methods are available to freeview.[12][13]

Stereographic cards and the stereoscope

Two separate images are printed side-by-side. When they are viewed without a stereoscopic viewer, the user must force their eyes either to cross or to diverge so that the two images appear to be three. As each eye then sees a different image, the effect of depth is achieved in the central image of the three.

The stereoscope offers several advantages:

A disadvantage of stereo cards, slides, or any other hard copy or print is that the two images are likely to receive differing wear, scratches and other decay. This results in stereo artifacts when the images are viewed. These artifacts compete in the mind, resulting in distraction from the 3D effect, eye strain and headaches.

Stereogram cards are frequently used by orthoptists and vision therapists in the treatment of many binocular vision and accommodative disorders.

Transparency viewers

The practice of viewing film-based transparencies in stereo via a viewer dates to at least as early as 1931, when Tru-Vue began to market filmstrips that were fed through a handheld device made from Bakelite. In the 1940s, a modified and miniaturized variation of this technology was introduced as the View-Master. Pairs of stereo views are printed on translucent film which is then mounted around the edge of a cardboard disk, the images of each pair being diametrically opposite. A lever is used to move the disk so as to present the next image pair. A series of seven views can thus be seen on each card when it is inserted into the View-Master viewer. These viewers were available in many forms, both non-lighted and self-lighted, and may still be found today. One type of material presented is children's fairy tale scenes or brief stories using popular cartoon characters; these use photographs of three-dimensional model sets and characters. Another type of material is a series of scenic views associated with some tourist destination, typically sold at gift shops located at the attraction.

Another important development in the late 1940s was the introduction of the Stereo Realist camera and viewer system. Using color slide film, this equipment made stereo photography available to the masses and caused a surge in its popularity. The Stereo Realist and competing products can still be found (in estate sales and elsewhere) and utilized today.

Low-cost folding cardboard viewers with plastic lenses have been used to view images from a sliding card and have been used by computer technical groups as part of annual convention proceedings. These have been supplanted by the DVD recording and display on a television set. By exhibiting moving images of rotating objects a three dimensional effect is obtained through other than stereoscopic means.

An advantage offered by transparency viewing is that a wider field of view may be presented since images, being illuminated from the rear, may be placed much closer to the lenses. Note that with simple viewers the images are limited in size as they must be adjacent and so the field of view is determined by the distance between each lens and its corresponding image.

Good quality wide angle lenses are quite expensive and they are not found in most stereo viewers.

Head-mounted displays

The user typically wears a helmet or glasses with two small LCD or OLED displays with magnifying lenses, one for each eye. The technology can be used to show stereo films, images or games, but it can also be used to create a virtual display. Head-mounted displays may also be coupled with head-tracking devices, allowing the user to "look around" the virtual world by moving their head, eliminating the need for a separate controller. Performing this update quickly enough to avoid inducing nausea in the user requires a great amount of computer image processing. If six-axis position sensing (direction and position) is used, then the wearer may move about within the limitations of the equipment used. Owing to rapid advancements in computer graphics and the continuing miniaturization of video and other equipment, these devices are beginning to become available at more reasonable cost.

Head-mounted or wearable glasses may be used to view a see-through image imposed upon the real world view, creating what is called augmented reality. This is done by reflecting the video images through partially reflective mirrors. The real world view is seen through the mirrors' reflective surface. Experimental systems have been used for gaming, where virtual opponents may peek from real windows as a player moves about. This type of system is expected to have wide application in the maintenance of complex systems, as it can give a technician what is effectively "x-ray vision" by combining computer graphics rendering of hidden elements with the technician's natural vision. Additionally, technical data and schematic diagrams may be delivered to this same equipment, eliminating the need to obtain and carry bulky paper documents.

Augmented stereoscopic vision is also expected to have applications in surgery, as it allows the combination of radiographic data (CAT scans and MRI imaging) with the surgeon's vision.

3D viewers

There are two categories of 3D viewer technology, active and passive. Active viewers have electronics which interact with a display.

Active

Liquid crystal shutter glasses

Liquid crystal shutter glasses contain liquid crystal that blocks or passes light in synchronization with the images on the computer display, using the concept of alternate-frame sequencing. There have been many examples of shutter glasses over the past few decades, such as the SegaScope 3-D glasses for the Sega Master System[14] and the Atari/Tektronix Stereotek 3D system [1], but the Nvidia 3D Vision gaming kit, introduced in 2008, brought this technology to mainstream consumers and PC gamers.[15] See also Time-division multiplexing.
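The idea of alternate-frame sequencing can be sketched in a few lines: the display shows left-eye and right-eye frames in alternation, and the shutter for the matching eye is open during each frame. The function names below are hypothetical, and the assumption that the sequence starts with the left eye is arbitrary; real systems also insert blanking intervals, which this sketch ignores.

```python
def shutter_schedule(num_frames, first_eye="left"):
    """Return, for each displayed frame, which eye's shutter is open.

    Assumes strict alternation with no blanking interval (a simplification).
    """
    eyes = ("left", "right") if first_eye == "left" else ("right", "left")
    return [eyes[i % 2] for i in range(num_frames)]


def per_eye_rate(display_hz):
    """Each eye sees every other frame, so its rate is half the display rate."""
    return display_hz / 2
```

Under these assumptions, a display refreshing at 120 Hz delivers 60 images per second to each eye.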

"Red eye" shutterglasses method

The Red Eye Method reduces the ghosting caused by the slow decay of the green and blue P22-type phosphors typically used in conventional CRT monitors. This method relies solely on the red component of the RGB image being displayed, with the green and blue component of the image being suppressed.

Passive

Linearly polarized glasses

To present a stereoscopic motion picture, two images are projected superimposed onto the same screen through orthogonal polarizing filters. It is best to use a silver screen so that polarization is preserved. The projectors can receive their outputs from a computer with a dual-head graphics card. The viewer wears low-cost eyeglasses which also contain a pair of orthogonal polarizing filters. As each filter only passes light which is similarly polarized and blocks the orthogonally polarized light, each eye only sees one of the images, and the effect is achieved. Linearly polarized glasses require the viewer to keep their head level, as tilting the viewing filters causes the images of the left and right channels to bleed over to the opposite channel and also misaligns the vision fields with those of the viewer's eyes. Viewers therefore learn very quickly not to tilt their heads. As the motion picture provides the same stereoscopic image to all, and no head tracking is involved, several people can view the movie at the same time from a limited breadth of angles.
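The bleed-over caused by head tilt can be estimated with Malus's law, under the assumption of ideal linear polarizers: a filter rotated by an angle θ from its intended orientation passes a fraction cos²θ of the intended image and sin²θ of the orthogonally polarized one. The sketch below is illustrative only; the function name is hypothetical, and real filters and screens are not ideal.

```python
import math


def linear_polarizer_leakage(tilt_degrees):
    """Return (intended_fraction, leaked_fraction) for a tilted ideal analyzer.

    Uses Malus's law: transmitted intensity is proportional to cos^2 of the
    angle between the light's polarization and the filter axis.
    """
    theta = math.radians(tilt_degrees)
    intended = math.cos(theta) ** 2  # light meant for this eye
    leaked = math.sin(theta) ** 2    # light meant for the other eye
    return intended, leaked
```

With these idealized assumptions, a 10° tilt leaks about 3% of the other eye's image, and at 45° both images reach each eye equally, so the stereo separation is lost.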

Circularly polarized glasses

To present a stereoscopic motion picture, two images are projected superimposed onto the same screen through circular polarizing filters of opposite handedness. The viewer wears low-cost eyeglasses which contain a pair of analyzing filters (circular polarizers mounted in reverse) of opposite handedness. Light that is left-circularly polarized is extinguished by the right-handed analyzer, while right-circularly polarized light is extinguished by the left-handed analyzer. The result is similar to that of stereoscopic viewing using linearly polarized glasses, except that the viewer can tilt their head and still maintain left/right separation (though the tilt will still affect the brain's ability to fuse the two images and correctly perceive depth).

The RealD Cinema system uses an electronically driven circular polarizer, mounted in front of the projector and alternating between left- and right-handedness, in sync with the left or right image being displayed by the (digital) movie projector. The audience wears passive circularly polarized glasses.

Infitec glasses

Infitec stands for interference filter technology. Special interference filters (dichromatic filters) in the glasses and in the projector form the main item of technology and have given it this name. The filters divide the visible color spectrum into six narrow bands – two in the red region, two in the green region, and two in the blue region (called R1, R2, G1, G2, B1 and B2 for the purposes of this description). The R1, G1 and B1 bands are used for one eye image, and R2, G2, B2 for the other eye. The human eye is largely insensitive to such fine spectral differences so this technique is able to generate full-color 3D images with only slight colour differences between the two eyes.[16] Sometimes this technique is described as a "super-anaglyph" because it is an advanced form of spectral-multiplexing which is at the heart of the conventional anaglyph technique.

Dolby uses a form of this technology in its Dolby 3D theatres.

Inficolor 3D

Developed by TriOviz, Inficolor 3D is a patent pending stereoscopic system, first demonstrated at the International Broadcasting Convention in 2007 and deployed in 2010. It works with traditional 2D flat panels and HDTV sets and uses expensive glasses with complex color filters and dedicated image processing that allow natural color perception with a 3D experience. When observed without glasses, some slight doubling can be noticed in the background of the action which allows watching the movie or the video game in 2D without the glasses. This is not possible with traditional brute force anaglyphic systems.[17]

Inficolor 3D is a part of TriOviz for Games Technology, developed in partnership with TriOviz Labs and Darkworks Studio. It works with Sony PlayStation 3 (Official PlayStation 3 Tools & Middleware Licensee Program)[18] and Microsoft Xbox 360 consoles as well as PC.[19][20] TriOviz for Games Technology was showcased at Electronic Entertainment Expo 2010 by Mark Rein (vice president of Epic Games) as a 3D tech demo running on an Xbox 360 with Gears of War 2.[21] In October 2010 this technology was officially integrated into Unreal Engine 3,[19][20] the computer game engine developed by Epic Games.

Video games equipped with TriOviz for Games Technology are: "Batman Arkham Asylum: Game of the Year Edition" for PS3 and Xbox 360 (March 2010),[22][23][24] "Enslaved: Odyssey to the West + DLC Pigsy's Perfect 10" for PS3 and Xbox 360 (Nov. 2010),[25][26] "Thor: God of Thunder" for PS3 and Xbox 360 (May 2011), "Green Lantern: Rise of the Manhunters" for PS3 and Xbox 360 (June 2011), "Captain America: Super Soldier" for PS3 and Xbox 360 (July 2011), "Gears of War 3" for Xbox 360 (September 2011), "Batman: Arkham City" for PS3 and Xbox 360 (October 2011), and "Assassin's Creed: Revelations" for PS3 and Xbox 360 (November 2011). The first DVD/Blu-ray including Inficolor 3D Tech is "Battle for Terra 3D" (published in France by Pathé & Studio 37, 2010).

Complementary color anaglyphs

Complementary color anaglyphs employ one of a pair of complementary color filters for each eye. The most common color filters used are red and cyan. According to tristimulus theory, the eye is sensitive to three primary colors: red, green, and blue. The red filter admits only red, while the cyan filter blocks red, passing blue and green (the combination of blue and green is perceived as cyan). If a paper viewer containing red and cyan filters is folded so that light passes through both, the image will appear black. Another recently introduced form employs blue and yellow filters. (Yellow is the color perceived when both red and green light pass through the filter.)
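A simple way to build a red-cyan anaglyph from a stereo pair follows directly from the filter behaviour described above: the red channel is taken from one view and the green and blue channels from the other. The sketch below assumes the red filter covers the left eye (a common convention, but an assumption here) and represents images as nested lists of (r, g, b) tuples; the function name is hypothetical.

```python
def red_cyan_anaglyph(left_image, right_image):
    """Combine two equally sized RGB images into one red-cyan anaglyph.

    Each image is a list of rows; each row is a list of (r, g, b) tuples.
    Red comes from the left view; green and blue come from the right view.
    """
    if len(left_image) != len(right_image):
        raise ValueError("images must have the same height")
    anaglyph = []
    for left_row, right_row in zip(left_image, right_image):
        if len(left_row) != len(right_row):
            raise ValueError("images must have the same width")
        anaglyph.append(
            [(lp[0], rp[1], rp[2]) for lp, rp in zip(left_row, right_row)]
        )
    return anaglyph
```

This channel swap is the plainest form of the technique; practical encoders usually adjust the colors first to reduce ghosting and retinal rivalry.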

Anaglyph images have seen a recent resurgence because of the presentation of images on the Internet. Although this has traditionally been a largely black-and-white format, recent digital camera and processing advances have brought very acceptable color images to the Internet and DVD field. With the online availability of low-cost paper glasses with improved red-cyan filters, and plastic-framed glasses of increasing quality, the field of 3D imaging is growing quickly. Scientific images where depth perception is useful include, for instance, the presentation of complex multi-dimensional data sets and stereographic images of the surface of Mars. With the recent release of 3D DVDs, anaglyphs are more commonly being used for entertainment. Anaglyph images are much easier to view than either parallel-sighting or crossed-eye stereograms, although those types do offer brighter and more accurate color rendering, most particularly in the red component, which is commonly muted or desaturated with even the best color anaglyphs. A compensating technique, commonly known as Anachrome, uses a slightly more transparent cyan filter in the patented glasses associated with the technique. Processing reconfigures the typical anaglyph image to have less parallax, giving a more useful image when viewed without filters.

Compensating diopter glasses for red-cyan method

Simple sheet or uncorrected molded glasses do not compensate for the 250 nanometer difference in the wavelengths of the red-cyan filters. With simple glasses, the red filter image can be blurry when viewing a close computer screen or printed image, since the retinal focus differs from that of the cyan-filtered image, which dominates the eyes' focusing. Better quality molded plastic glasses employ a compensating differential diopter power to equalize the red filter focus shift relative to the cyan. The direct-view focus on computer monitors has recently been improved by manufacturers providing secondary paired lenses fitted and attached inside the red-cyan primary filters of some high-end anaglyph glasses. They are used where very high resolution is required, including science, stereo macros, and animation studio applications. They use carefully balanced cyan (blue-green) acrylic lenses, which pass a minute percentage of red to improve skin tone perception. Simple red/blue glasses work well with black and white, but the blue filter is unsuitable for human skin in color.

ColorCode 3D

ColorCode 3D is a newer, patented[27] stereo viewing system deployed in the 2000s that uses amber and blue filters. Notably, unlike other anaglyph systems, ColorCode 3D is intended to provide perceived nearly full colour viewing (particularly within the RG color space) with existing television and paint mediums. One eye (left, amber filter) receives the cross-spectrum colour information and one eye (right, blue filter) sees a monochrome image designed to give the depth effect. The human brain ties both images together.

Images viewed without filters will tend to exhibit light-blue and yellow horizontal fringing. The backwards compatible 2D viewing experience for viewers not wearing glasses is improved, generally being better than previous red and green anaglyph imaging systems, and further improved by the use of digital post-processing to minimise fringing. The displayed hues and intensity can be subtly adjusted to further improve the perceived 2D image, with problems only generally found in the case of extreme blue.

The blue filter is centred around 450 nm and the amber filter lets in light at wavelengths above 500 nm. Wide spectrum colour is possible because the amber filter lets through light across most wavelengths in the spectrum. When presented via RGB color model televisions, the original red and green channels from the left image are combined with a monochrome blue channel formed by averaging the right image with the weights {r: 0.15, g: 0.15, b: 0.7}.
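The channel combination described above can be written out per pixel. The following is a minimal sketch using only the weights given in the text; the function name and the representation of a pixel as an (r, g, b) tuple are illustrative assumptions, and any additional post-processing used by the actual system is not modelled.

```python
# Weights for the monochrome blue channel, as stated in the text.
BLUE_WEIGHTS = (0.15, 0.15, 0.7)


def colorcode_pixel(left_rgb, right_rgb):
    """Combine one left-view pixel and one right-view pixel.

    Red and green are copied from the left view; blue is a weighted
    average of the right view's red, green and blue values.
    """
    red, green, _ = left_rgb
    blue = sum(w * c for w, c in zip(BLUE_WEIGHTS, right_rgb))
    return (red, green, blue)
```

Because the weights sum to 1, a uniformly grey right-view pixel keeps its brightness in the blue channel.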

In the United Kingdom, television station Channel 4 commenced broadcasting a series of programmes encoded using the system during the week of 16 November 2009.[28] Previously the system had been used in the United States during the 2009 Super Bowl for an "all 3-D advertisement" for SoBe, for the animated movie Monsters vs. Aliens, and for an advertisement for the Chuck television series, whose full episode the following night used the format.

Chromadepth method and glasses

The ChromaDepth procedure of American Paper Optics is based on the fact that a prism separates colors by varying degrees. The ChromaDepth eyeglasses contain special view foils, which consist of microscopically small prisms. These cause the image to be translated by an amount that depends on its color. If a prism foil is used with one eye but not the other, then the two pictures seen are, depending on color, more or less widely separated. The brain produces the spatial impression from this difference. The main advantage of this technology is that ChromaDepth pictures can also be viewed without eyeglasses (and thus two-dimensionally) without problems, unlike two-color anaglyphs. However, the choice of colors is limited, since the colors carry the depth information of the picture. If the color of an object is changed, its observed distance also changes.

Anachrome "compatible" color anaglyph method

A recent variation on the anaglyph technique is called the "Anachrome method". This approach is an attempt to provide images that look fairly normal without glasses as 2D images, so as to be "compatible" for posting on conventional websites or in magazines. The 3D effect is generally more subtle, as the images are shot with a narrower stereo base (the distance between the camera lenses). Pains are taken to adjust for a better overlay fit of the two images, which are layered one on top of another. Only a few pixels of non-registration give the depth cues. The range of color is perhaps three times wider in Anachrome due to the deliberate passage of a small amount of the red information through the cyan filter. Warmer tones can be boosted, and this is claimed to provide warmer skin tones and vividness.

Autostereoscopy

Autostereoscopy is any method of displaying stereoscopic (3D) images without the use of special headgear or glasses on the part of the viewer. Because headgear is not required, it is also called "glasses-free 3D". The technology includes two broad classes of displays: those that use head-tracking to ensure that each of the viewer's two eyes sees a different image on the screen, and those that display multiple views so that the display does not need to know where the viewers' eyes are directed. Examples of autostereoscopic displays include parallax barrier, lenticular, volumetric, electro-holographic, and light field displays.

Some autostereoscopic displays are also capable of recreating a perception of movement parallax, which is not possible with any of the active or passive technologies discussed above. "Movement parallax" refers to the fact that the view of a scene changes with movement of the head. Thus, different images of the scene are seen as the head is moved from left to right, and from up to down.

Autostereoscopy is used by the Nintendo 3DS video game system and by the Optimus 3D and LG Thrill from cellphone manufacturer LG Electronics MobileComm USA.

A fundamentally new approach to autostereoscopy, called HR3D, has been developed by researchers from MIT's Media Lab. It would consume half as much power, doubling the battery life if used with devices like the Nintendo 3DS, without compromising screen brightness or resolution. Other advantages include a larger viewing angle and a 3D effect that is maintained even when the screen is rotated.[29]

Other display methods

Autostereoscopic

Autostereoscopic display technologies use optical components in the display, rather than worn by the user, to enable each eye to see a different image. The optics split the images directionally into the viewer's eyes, so the display viewing geometry requires limited head positions that will achieve the stereoscopic effect. Automultiscopic displays provide multiple views of the same scene, rather than just two. Each view is visible from a different range of positions in front of the display. This allows the viewer to move left-right in front of the display and see the correct view from any position. Example technologies include parallax barriers and specular holography.

Computer-generated holography

Research into holographic displays has produced devices which are able to create a light field identical to that which would emanate from the original scene, with both horizontal and vertical parallax across a large range of viewing angles. The effect is similar to looking through a window at the scene being reproduced; this may make CGH the most convincing of the 3D display technologies, but as yet the large amounts of calculation required to generate a detailed hologram largely prevent its application outside of the laboratory.

Volumetric displays

Volumetric displays use some physical mechanism to display points of light within a volume. Such displays use voxels instead of pixels. Volumetric displays include multiplanar displays, which have multiple display planes stacked up, and rotating panel displays, where a rotating panel sweeps out a volume.

Other technologies have been developed to project light dots in the air above a device. An infrared laser is focused on the destination in space, generating a small bubble of plasma which emits visible light.

Taking the pictures

It is necessary to take two photographs for a stereoscopic image. This can be done with two cameras, with one camera moved quickly to two positions, or with a stereo camera incorporating two or more side-by-side lenses.

In the 1950s, stereoscopic photography regained popularity when a number of manufacturers began introducing stereoscopic cameras to the public. The new cameras were developed to use 135 film, which had gained popularity after the close of World War II. Many of the conventional cameras used the film for 35 mm transparency slides, and the new stereoscopic cameras utilized the film to make stereoscopic slides. The Stereo Realist camera was the most popular, and its 5P picture format became a standard. The stereoscopic cameras were marketed with special viewers that allowed for the use of such slides. With these cameras the public could easily create their own stereoscopic memories. Although their popularity has waned, some of these cameras are still in use today.

The 1980s saw a minor revival of stereoscopic photography when point-and-shoot stereo cameras were introduced. Most of these cameras suffered from poor optics and plastic construction, and were designed to produce lenticular prints, a format which never gained wide acceptance, so they never gained the popularity of the 1950s stereo cameras.

The beginning of the 21st century marked the coming of the age of digital photography. Stereo lenses were introduced which could turn an ordinary film camera into a stereo camera by using a special double lens to take two images and direct them through a single lens to capture them side-by-side on the film. Although current digital stereo cameras cost hundreds of dollars,[30] cheaper models also exist, for example those produced by the company Loreo. It is also possible to create a twin camera rig by mounting two cameras on a bracket, spaced a short distance apart, together with a "shepherd" device that synchronizes the shutter and flash of the two cameras so that both take pictures at the same time. Newer cameras are even being used to shoot "step video" 3D slide shows with many pictures, almost like a 3D motion picture if viewed properly. A modern camera can take five pictures per second, with images that greatly exceed HDTV resolution.

If anything is in motion within the field of view, it is necessary to take both images at once, either through use of a specialized two-lens camera, or by using two identical cameras, operated as close as possible to the same moment.

A single camera can also be used if the subject remains perfectly still (such as an object in a museum display). Two exposures are required. The camera can be moved on a sliding bar for offset, or with practice, the photographer can simply shift the camera while holding it straight and level. This method of taking stereo photos is sometimes referred to as the "Cha-Cha" or "Rock and Roll" method.[31] It is also sometimes referred to as the "astronaut shuffle" because it was used to take stereo pictures on the surface of the moon using normal monoscopic equipment.[32]

For the most natural looking stereo most stereographers move the camera about 65mm or the distance between the eyes,[33] but some experiment with other distances. A good rule of thumb is to shift sideways 1/30th of the distance to the closest subject for 'side by side' display, or just 1/60th if the image is to be also used for color anaglyph or anachrome image display. For example, when enhanced depth beyond natural vision is desired and a photo of a person in front of a house is being taken, and the person is thirty feet away, then the camera should be moved 1 foot between shots.[33]
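The rule of thumb above amounts to a one-line calculation. The sketch below encodes the two ratios given in the text (1/30 for side-by-side display, 1/60 for anaglyph or anachrome display); the function name and the `display` labels are hypothetical, and the result is in whatever unit the distance is given in.

```python
def stereo_base(nearest_subject_distance, display="side_by_side"):
    """Suggested sideways camera shift between the two exposures.

    display: "side_by_side" uses 1/30 of the distance to the closest
    subject; "anaglyph" (also for anachrome display) uses 1/60.
    """
    divisors = {"side_by_side": 30, "anaglyph": 60}
    if display not in divisors:
        raise ValueError("display must be 'side_by_side' or 'anaglyph'")
    return nearest_subject_distance / divisors[display]
```

For the example in the text, `stereo_base(30)` gives a shift of 1 (foot) for a subject thirty feet away, and `stereo_base(30, "anaglyph")` gives half that.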

The stereo effect is not significantly diminished by slight pan or rotation between images. In fact slight rotation inwards (also called 'toe in') can be beneficial. Bear in mind that both images should show the same objects in the scene (just from different angles) - if a tree is on the edge of one image but out of view in the other image, then it will appear in a ghostly, semi-transparent way to the viewer, which is distracting and uncomfortable. Therefore, you can either crop the images so they completely overlap, or you can 'toe-in' the cameras so that the images completely overlap without having to discard any of the images. However, be a little cautious - too much 'toe-in' can cause eye strain for reasons best described here.[34]

Base line selection

For general purpose stereo photography, where the goal is to duplicate natural human vision and give a visual impression as close as possible to actually being there, the correct baseline (the distance between the points from which the right and left images are taken) is the same as the distance between the eyes.[35] When images taken with such a baseline are viewed using a method that duplicates the conditions under which the picture was taken, the result is an image very close to what an observer actually present would see. This could be described as "ortho stereo."

An example would be the Realist format that was so popular in the late 1940s to mid 1950s and is still being used by some today. When these images are viewed using high quality viewers, or seen with a properly set up projector, the impression is, indeed, very close to what you would see if you were there.

The baseline used in such cases will be about 50mm to 80mm. This is what is generally referred to as a "normal" baseline, used in most stereo photography. There are, however, situations where it might be desirable to use a longer or shorter baseline. The factors to consider include the viewing method to be used and the goal in taking the picture.

Longer base line for distant objects "Hyper Stereo"

If a stereo picture is taken of a large, distant object such as a mountain or a large building using a normal base, it will appear to be flat.[36] This is in keeping with normal human vision: it would look flat if you were actually there. But if the object looks flat, there seems to be little point in taking a stereo picture, as it will simply seem to be behind a stereo window, with no depth in the scene itself, much like looking at a flat photograph from a distance.

One way of dealing with this situation is to include a foreground object to add depth interest and enhance the feeling of "being there", and this is the advice commonly given to novice stereographers.[37][38] Caution must be used, however, to ensure that the foreground object is not too prominent, and appears to be a natural part of the scene, otherwise it will seem to become the subject with the distant object being merely the background.[39] In cases like this, if the picture is just one of a series with other pictures showing more dramatic depth, it might make sense just to leave it flat, but behind a window.[39]

For making stereo images featuring only a distant object (e.g., a mountain with foothills), the camera positions can be separated by a larger distance (commonly called the "interocular" or stereo base) than the adult human norm of 62-65 mm. This will effectively render the captured image as though it were seen by a giant, and thus will enhance the depth perception of these distant objects and reduce the apparent scale of the scene proportionately.[40] However, in this case care must be taken not to bring objects in the close foreground too close to the viewer, as they will show excessive parallax and can complicate stereo window adjustment.

There are two main ways to accomplish this. One is to use two cameras separated by the required distance, the other is to shift a single camera the required distance between shots.

The shift method has been used with cameras such as the Stereo Realist to take hypers, either by taking two pairs and selecting the best frames, or by alternately capping each lens and recocking the shutter.[36][41]

It is also possible to take hyperstereo pictures using an ordinary single lens camera aimed out of an airplane. One must be careful, however, about movement of clouds between shots.[42]

It has even been suggested that a version of hyperstereo could be used to help pilots fly planes.[43]

In such situations, where an ortho stereo viewing method is used, a common rule of thumb is the 1:30 rule.[44] This means that the baseline will be equal to 1/30 of the distance to the nearest object included in the photograph.

The results of hyperstereo can be quite impressive,[45][46][47] and examples of hyperstereo can be found in vintage views.[48]

This technique can be applied to 3D imaging of the Moon: one picture is taken at moonrise and the other at moonset, since the face of the Moon is centered towards the center of the Earth and the diurnal rotation carries the photographer around the perimeter. The results are rather poor, however, and much better results can be obtained using alternative techniques.[49]

This is why high quality published stereos of the Moon are done using libration,[50][51][52][53] the slight "wobbling" of the Moon on its axis relative to the Earth.[54] Similar techniques were used late in the 19th century to take stereo views of Mars and other astronomical subjects.[54]

Limitations of hyperstereo

Vertical alignment can become a big problem, especially if the terrain on which the two camera positions are placed is uneven.

Movement of objects in the scene can make syncing two widely separated cameras a nightmare. When a single camera is moved between two positions even subtle movements such as plants blowing in the wind and the movement of clouds can become a problem.[41] The wider the baseline, the more of a problem this becomes.

Pictures taken in this fashion take on the appearance of a miniature model, taken from a short distance,[55][56][57] and those not familiar with such pictures often cannot be convinced that it is the real object. This is because we cannot see depth when looking at such scenes in real life and our brains aren't equipped to deal with the artificial depth created by such techniques, and so our minds tell us it must be a smaller object viewed from a short distance, which would have depth. Though most eventually realize it is, indeed, an image of a large object from far away, many find the effect bothersome.[58] This doesn't rule out using such techniques, but it is one of the factors that need to be considered when deciding whether or not such a technique should be used.

Hyper stereo can also lead to cardboarding, an effect that creates stereos in which different objects seem well separated in depth, but the objects themselves seem flat. This is because parallax is quantised.[59]

Illustration of the limits of parallax multiplication (refer to the accompanying image; ortho viewing method assumed). The line represents the Z axis, so imagine that it is lying flat and stretching into the distance.

If the camera is at point X, point A is on an object at 30 feet, point B is on an object at 200 feet, point C is on the same object but 1 inch behind B, and point D is on an object 250 feet away. With a normal baseline, point A is clearly in the foreground, with B, C, and D all at stereo infinity. With a one-foot baseline, which multiplies the parallax, there will be enough parallax to separate all four points, though the depth in the object containing B and C will still be subtle. If this object is the main subject, a baseline of 6 feet 8 inches may be considered, but then the object at A would need to be cropped out.

Now imagine that the camera is at point Y, so that the object at A is at 2,000 feet, point B is on an object at 2,170 feet, point C is on the same object 1 inch behind B, and point D is on an object at 2,220 feet. With a normal baseline, all four points are now at stereo infinity. With a 67-foot baseline, the multiplied parallax shows that all three objects are on different planes, yet points B and C, on the same object, appear to be on the same plane, and all three objects appear flat. This is because parallax comes in discrete units: at 2,170 feet the parallax between B and C is zero, and zero multiplied by any number is still zero.
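The camera-at-Y case can be reproduced numerically. The sketch below is only an illustration under stated assumptions: it uses the pinhole relation parallax = focal length × baseline / depth, and it models the "discrete units" as whole pixels with a hypothetical focal length of 2000 pixels. The text does not specify the size of the parallax unit, so the pixel step and all names here are assumptions.

```python
FOCAL_LENGTH_PX = 2000.0   # hypothetical focal length expressed in pixels
MM_PER_FOOT = 304.8
ONE_INCH_FT = 1.0 / 12.0


def quantised_parallax(baseline_ft, depth_ft, focal_px=FOCAL_LENGTH_PX):
    """Parallax of a point at depth_ft, rounded to whole pixels."""
    return round(focal_px * baseline_ft / depth_ft)


def parallax_table(baseline_ft, depths_ft):
    """Map each named point to its quantised parallax."""
    return {name: quantised_parallax(baseline_ft, z)
            for name, z in depths_ft.items()}


# Depths (in feet) of points A to D as seen from camera position Y.
depths_at_y = {
    "A": 2000.0,
    "B": 2170.0,
    "C": 2170.0 + ONE_INCH_FT,  # same object, 1 inch behind B
    "D": 2220.0,
}

normal = parallax_table(65.0 / MM_PER_FOOT, depths_at_y)  # 65 mm baseline
hyper = parallax_table(67.0, depths_at_y)                 # 67 ft baseline

print("normal baseline:", normal)
print("67 ft baseline: ", hyper)
```

Under these assumptions the normal baseline puts all four points in the same parallax step, while the 67-foot baseline separates A, B and D but still leaves B and C in the same step, which is the cardboarding effect described above.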

A practical example

In the red-cyan anaglyph example below, a ten-meter baseline atop the roof ridge of a house was used to image the mountain. The two foothill ridges are about four miles (6.5 km) distant and are separated in depth from each other and the background. The baseline is still too short to resolve the depth of the two more distant major peaks from each other. Owing to various trees that appeared in only one of the images the final image had to be severely cropped at each side and the bottom.

In the wider image, taken from a different location, a single camera was walked about one hundred feet (30 m) between pictures. The images were converted to monochrome before combination.


Shorter baseline for ultra closeups "Macro stereo"

Closeup stereo of a cake photographed using a Fuji W3. Taken by backing off several feet and then zooming in.

When objects are taken from closer than about 6 1/2 feet a normal base will produce excessive parallax and thus exaggerated depth when using ortho viewing methods. At some point the parallax becomes so great that the image is difficult or even impossible to view. For such situations, it becomes necessary to reduce the baseline in keeping with the 1:30 rule.
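The distance at which a normal base starts to exceed the 1:30 rule can be checked with a short calculation. This is a rough sketch that assumes a normal base of 65 mm; the function names are illustrative only.

```python
MM_PER_FOOT = 304.8


def capped_baseline_mm(nearest_mm, normal_base_mm=65.0, ratio=30.0):
    """Use the normal base unless the 1:30 rule calls for a shorter one."""
    return min(normal_base_mm, nearest_mm / ratio)


def crossover_feet(normal_base_mm=65.0, ratio=30.0):
    """Subject distance below which the normal base exceeds the rule."""
    return normal_base_mm * ratio / MM_PER_FOOT


print(round(crossover_feet(), 2))   # close to the 6 1/2 feet quoted above
print(capped_baseline_mm(300.0))    # subject 300 mm away
print(capped_baseline_mm(5000.0))   # subject 5 m away
```

With a 65 mm base the crossover comes out at roughly 6.4 feet, in line with the figure of about 6 1/2 feet given above. A subject 300 mm away would call for a base of 10 mm under the same rule.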

When still life scenes are stereographed, an ordinary single lens camera can be moved using a slide bar or similar method to generate a stereo pair. Multiple views can be taken and the best pair selected for the desired viewing method.

For moving objects, a more sophisticated approach is used. In the early 1970s, Realist Incorporated introduced the Macro Realist, designed to stereograph subjects 4 to 5 1/2 inches away for viewing in Realist format viewers and projectors. It featured a 15 mm base and fixed focus.[60] It was invented by Clarence G. Henning.[61]

In recent years cameras have been produced which are designed to stereograph subjects 10" to 20" away using print film, with a 27 mm baseline.[62] Another technique, usable with fixed base cameras such as the Fujifilm FinePix Real 3D W1/W3, is to back off from the subject and use the zoom function to zoom to a closer view, as was done in the image of a cake. This has the effect of reducing the effective baseline. Similar techniques could be used with paired digital cameras.

Another way to take images of very small objects, "extreme macro", is to use an ordinary flatbed scanner. This is a variation on the shift technique in which the object is turned upside down and placed on the scanner, scanned, moved over and scanned again. This produces stereos of a range of objects, from as large as about 6" across down to as small as a carrot seed. This technique goes back to at least 1995. See the article Scanography for more details.

Baseline tailored to viewing method

The distance from which the picture is viewed determines the appropriate separation between the cameras. This separation is called the stereo base or stereo baseline, and it follows from the ratio of the viewing distance to the distance between the eyes (usually about 2.5 inches). In any case, the farther away the screen is viewed from, the more the image will pop out; the closer it is viewed from, the flatter it will appear. Personal anatomical differences can be compensated for by moving closer to or farther from the screen.

To provide close emulation of natural vision for images viewed on a computer monitor, a fixed stereo base of 6 cm might be appropriate. This will vary depending on the size of the monitor and the viewing distance. For hyper stereo, a ratio smaller than 1:30 could be used. For example, if a stereo image is to be viewed on a computer monitor from a distance of 1000 mm, there will be an eye-to-view ratio of 1000/63, or about 16. To set the cameras the appropriate distance apart for the desired effect, the distance to the subject (say, a person 3 meters from the cameras) is divided by 16, which yields a stereo base of 188 mm between the cameras.
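The monitor example above can be written out as a short calculation. This is a minimal sketch; the function name and the round_ratio flag are illustrative assumptions, added only so that the effect of rounding the ratio to 16, as the text does, is visible.

```python
def stereo_base_mm(viewing_distance_mm, subject_distance_mm,
                   eye_separation_mm=63.0, round_ratio=False):
    """Stereo base from the viewing-distance to eye-separation ratio."""
    ratio = viewing_distance_mm / eye_separation_mm
    if round_ratio:
        ratio = round(ratio)  # the worked example rounds 15.87 up to 16
    return subject_distance_mm / ratio


# Monitor viewed from 1000 mm, subject 3 m from the cameras.
print(stereo_base_mm(1000.0, 3000.0, round_ratio=True))  # ratio rounded to 16
print(stereo_base_mm(1000.0, 3000.0))                    # unrounded ratio
```

Rounding the ratio to 16 gives 187.5 mm, which the text quotes as 188 mm. Keeping the unrounded ratio of about 15.87 gives roughly 189 mm, so the difference is negligible in practice.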

However, images optimized for a small screen viewed from a short distance will show excessive parallax when viewed with more ortho methods, such as a projected image or a head mounted display, possibly causing eyestrain and headaches, or doubling, so pictures optimized for this viewing method may not be usable with other methods.

Where images may also be used for anaglyph display, a narrower base (say 40 mm, or a variable base of 1:50 or 1:60) will allow for less ghosting in the display.

Variable Base for "Geometric Stereo"

As mentioned previously, the goal of the photographer may be a reason for using a baseline that is larger than normal. Such is the case when, instead of trying to achieve a close emulation to natural vision, a stereographer may be trying to achieve geometric perfection. This approach means that objects are shown with the shape they actually have, rather than the way they are seen by humans.

Objects at 25 to 30 feet, instead of having the subtle depth that you would see if you were actually there, or that would be recorded with a normal baseline, will have the much more dramatic depth that would be seen from 7 to 10 feet. In other words, the baseline is chosen to produce the same depth effect regardless of the distance from the subject. As with true ortho, this effect is impossible to achieve in a literal sense, since different objects in the scene will be at different distances and will thus show different amounts of parallax, but the geometric stereographer, like the ortho stereographer, attempts to come as close as possible.

Achieving this could be as simple as using the 1:30 rule to find a custom base for every shot, regardless of distance, or it could involve using a more complicated formula.[63]

This could be thought of as a form of hyperstereo,[64] but less extreme. As a result, it has the same limitations as hyperstereo. When objects are given enhanced depth but are not magnified to take up a larger portion of the view, there is a certain miniaturization effect. Of course, this may be exactly what the stereographer has in mind.

While geometric stereo neither attempts nor achieves a close emulation of natural vision, there are valid reasons for this approach. It does, however, represent a very specialized branch of stereography.

Precise stereoscopic baseline calculation methods

Recent research has led to precise methods for calculating the stereoscopic camera baseline.[65] These techniques consider the geometry of the display/viewer and scene/camera spaces independently and can be used to reliably calculate a mapping of the scene depth being captured to a comfortable display depth budget. This frees up the photographer to place their camera wherever they wish to achieve the desired composition and then use the baseline calculator to work out the camera inter-axial separation required to produce a high quality 3D image.

This approach means there is no guesswork in the stereoscopic setup once a small set of parameters has been measured. It can be applied to both photography and computer graphics, and the methods can easily be implemented in a software tool.

Multi-rig stereoscopic cameras

The precise methods for camera control have also allowed the development of multi-rig stereoscopic cameras, where different slices of scene depth are captured using different inter-axial settings.[66] The images of the slices are then composited together to form the final stereoscopic image pair. This allows important regions of a scene to be given better stereoscopic representation while less important regions are assigned less of the depth budget. It provides stereographers with a way to manage composition within the limited depth budget of each individual display technology.

Stereoscopic motion measurement (6D-Vision)

Typical traffic scene: A person runs onto the street behind a standing car.
Result of the 6D-Vision Algorithm. The arrows point to the predicted position in 0.5 seconds.

Classical stereoscopy measures the three spatial coordinates (3D position) of corresponding points from a pair of images. Many applications require clustering of 3D point clouds into distinct objects. This can cause severe problems if objects are too close together. For example, the child entering the street behind the car, as shown in the picture above, can only be separated from the car by its motion. 6D-Vision[67] tracks points with known depth from stereo over two or more consecutive images and fuses the data. The result is improved accuracy of the 3D position and, at the same time, an estimate of the 3D motion (velocity and direction) of the considered points. The 6D information (3D position + 3D motion = 6D-Vision) allows the trajectory of objects to be predicted and potential collisions to be detected. A result of this principle is shown in the image above, where the arrows indicate the expected object position in 0.5 seconds. More details are given on the homepage of the 6D-Vision developers.[68]
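The idea of combining a 3D position with a 3D motion to predict where a point will be in 0.5 seconds can be illustrated with a simple constant-velocity extrapolation. This sketch is not the 6D-Vision algorithm itself, which fuses measurements over several frames; it only estimates velocity from two positions by finite differences. All names and the sample coordinates are hypothetical.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]


def estimate_velocity(p_prev: Vec3, p_curr: Vec3, dt_s: float) -> Vec3:
    """Finite-difference velocity of a tracked 3D point between two frames."""
    if dt_s <= 0:
        raise ValueError("dt_s must be positive")
    return tuple((c - p) / dt_s for p, c in zip(p_prev, p_curr))


def predict_position(p_curr: Vec3, velocity: Vec3,
                     horizon_s: float = 0.5) -> Vec3:
    """Extrapolate the point along its velocity, assuming constant motion."""
    return tuple(c + v * horizon_s for c, v in zip(p_curr, velocity))


# Hypothetical point 10 m ahead, moving sideways towards the lane centre.
previous = (2.0, 0.0, 10.0)   # metres, one frame earlier
current = (1.75, 0.0, 10.0)   # metres, current frame
velocity = estimate_velocity(previous, current, dt_s=0.125)
predicted = predict_position(current, velocity)

print("velocity (m/s):", velocity)
print("position in 0.5 s (m):", predicted)
```

The six numbers carried per point (three for position, three for velocity) are what allow a moving point to be distinguished from a stationary one at nearly the same location.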

6D-Vision is also applied to the perception of gestures (the motion of human limbs) without modeling the shape of persons, using only a passive stereo camera.[69]

See also

Marketing terms:
  • 4D film - Marketing term for a 3D film plus audience-interactive effects

References

  1. ^ Dodgson, N.A. (August 2005). "Autostereoscopic 3D Displays". IEEE Computer 38 (8): 31–36. doi:10.1109/MC.2005.252. ISSN 0018-9162. 
  2. ^ Flight Simulation, J. M. Rolfe and K. J. Staples, Cambridge University Press, 1986, page 134
  3. ^ a b Contributions to the Physiology of Vision.—Part the First. On some remarkable, and hitherto unobserved, Phenomena of Binocular Vision. By CHARLES WHEATSTONE, F.R.S., Professor of Experimental Philosophy in King's College, London. Stereoscopy.com
  4. ^ Welling, William. Photography in America, page 23
  5. ^ Stereo Realist Manual, p. 375.
  6. ^ Stereo Realist Manual, pp. 377-379.
  7. ^ Fay Huang, Reinhard Klette, and Karsten Scheibe: Panoramic Imaging (Sensor-Line Cameras and Laser Range-Finders). Wiley & Sons, Chichester, 2008
  8. ^ Dornaika, F.; Hammoudi, K (2009). "Extracting 3D Polyhedral Building Models from Aerial Images using a Featureless and Direct Approach" (PDF). Proc. IAPR/MVA. http://www.mva-org.jp/Proceedings/2009CD/papers/12-02.pdf. Retrieved 2010-09-26. 
  9. ^ στερεός Tufts.edu, Henry George Liddell, Robert Scott, A Greek-English Lexicon, on Perseus Digital Library
  10. ^ σκοπέω, Henry George Liddell, Robert Scott, A Greek-English Lexicon, on Perseus Digital Library
  11. ^ The Logical Approach to Seeing 3D Pictures. www.vision3d.com by Optometrists Network. Retrieved 2009-08-21
  12. ^ How To Freeview Stereo (3D) Images. Greg Erker. Retrieved 2009-08-21
  13. ^ How to View Photos on This Site. Stereo Photography - The World in 3D. Retrieved 2009-08-21
  14. ^ Sega Master System
  15. ^ Olin Coles. "NVIDIA GeForce 3D Vision Gaming Kit". BenchmarkReviews.com. http://benchmarkreviews.com/index.php?option=com_content&task=view&id=276&Itemid=58. Retrieved 2009-01-08. 
  16. ^ Jorke, Helmut; Fritz M. (2006). "Stereo projection using interference filters". Stereoscopic Displays and Applications Proc. SPIE 6055. http://spie.org/x648.xml?product_id=650348. Retrieved 2008-11-19. 
  17. ^ Digitalcinemareport.com The Games We Play by Michael Karagosian
  18. ^ PRnewswire.com, TriOviz for Games Adds 3D TV Support for Console Titles
  19. ^ a b Joystiq.com, Epic's Mark Rein goes in-depth with Unreal Engine 3's TriOviz 3D
  20. ^ a b Epicgames.com, TriOviz for Games Technology Brings 3D Capabilities to Unreal Engine 3
  21. ^ Computer and Video Games.com: E3 2010: Epic makes 3D Gears Of War 2 - We've seen it. It's mega. But retail release not planned (17-Jun-2010)
  22. ^ Engadget.com Darkworks shows off TriOviz for Games 2D-to-3D SDK, we get a good look
  23. ^ Spong.com, Reviews of Batman Arkham Asylum Game of the Year Edition in 3D
  24. ^ Batmanarkhamasylum.com, How do you add another dimension to one of the best games of 2009?
  25. ^ Enslaved.namco.com Pigsy's DLC in 3D
  26. ^ Gamesradar.com Enslaved: Pigsy's DLC review
  27. ^ Sorensen, Svend Erik Borre; Hansen, Per Skafte; Sorensen, Nils Lykke (2001-05-31). "Method for recording and viewing stereoscopic images in color using multichrome filters". United States Patent 6687003. Free Patents Online. http://www.freepatentsonline.com/6687003.html. 
  28. ^ "Announcements". 3D Week. 2009-10-11. http://www.channel4.com/programmes/3d-week/articles/3d-week. Retrieved 2009-11-18. "glasses that will work for Channel 4’s 3D week are the Amber and Blue ColourCode 3D glasses" 
  29. ^ http://www.physorg.com/news/2011-05-glasses-free-d-fundamentally-approach.html
  30. ^ Fuji W3
  31. ^ Mac Digital Photography - 2003, Wiley, p.125, Dennis R. Cohen, Erica Sadun - 2003
  32. ^ Stereo World, National Stereoscopic Association Vol 17 #3 pp. 4-10
  33. ^ a b The chacha method
  34. ^ How a 3-D Stereoscopic Movie is made – 3-D Revolution Productions
  35. ^ Dr. T
  36. ^ a b Stereo Realist Guide, by Kenneth Tydings, Greenberg, 1951, page 100
  37. ^ Stereo Realist Manual, p. 27.
  38. ^ Stereo Realist Manual, p. 261.
  39. ^ a b Stereo Realist Manual, p. 156.
  40. ^ BUCKINGHAM PALACE IN HYPERSTEREO
  41. ^ a b Stereo World Volume 37 #1 Inside Front Cover
  42. ^ Stereoworld Vol 21 #1 March/April 1994 IFC, 51
  43. ^ Stereoworld Vol 16 #1 March/April 1989 pp 36-37
  44. ^ Lens separation in stereo photography
  45. ^ Stereoworld Vol 16 #2 May/June 1989 pp 20-21
  46. ^ Stereoworld Vol 8 #1 March/April 1981 pp 16-17
  47. ^ Stereoworld Vol 31 #6 May/June 2006 pp 16-22
  48. ^ Stereoworld Vol 17 #5 Nov/DEC 1990 pp 32-33
  49. ^ a b Stereo Lunar Photos by John C. Ballou. An in-depth look at Moon stereos with examples using several techniques
  50. ^ Stereoworld Vol 23 #2 May/June 1996 pp 25-30
  51. ^ Stereo moon photo
  52. ^ Brians Soapbox February 2009
  53. ^ London Stereoscopic Company - Official Web Site. A more in-depth explanation
  54. ^ a b Stereoworld Vol 15 #3 July/August 1988 pp 25-30
  55. ^ Stereo Realist Guide, by Kenneth Tydings, Greenberg, 1951, page 101
  56. ^ The Vision of Hyperspace, Arthur Chandler, 1975, Stereo World, vol 2 #5 pp 2-3, 12
  57. ^ Historical World Trade Center Photographs
  58. ^ Hyperspace: a comment, Paul Wing, 1976, Stereo World, vol 2 #6 page 2
  59. ^ Cardboarding
  60. ^ Willke & Zakowski
  61. ^ Simmons
  62. ^ The 3D Mac
  63. ^ Bercovitz Formulae for stereo base
  64. ^ Rocky Mountain Memories
  65. ^ Jones, G.R.; Lee, D., Holliman, N.S., Ezra, D. (2001). "Controlling perceived depth in stereoscopic images" (PDF). Stereoscopic Displays and Applications Proc. SPIE 4297A. http://www.dur.ac.uk/n.s.holliman/Presentations/EI4297A-07Protocols.pdf. 
  66. ^ Holliman, N. S. (2004). "Mapping perceived depth to regions of interest in stereoscopic images" (PDF). Stereoscopic Displays and Applications Proc. SPIE 5291. http://www.dur.ac.uk/n.s.holliman/Presentations/EI5291A-12.pdf. 
  67. ^ “6D-Vision: Fusion of Stereo and Motion for Robust Environment Perception”, Uwe Franke, Clemens Rabe, Hernán Badino, Stefan Gehrig, Daimler Chrysler AG, DAGM Symposium 2005 Springerlink.com
  68. ^ 6D-Vision Homepage
  69. ^ "Gesture-perception mit 6D-Vision", 3Vi GmbH, 3-vi.com

External links